Enterprise Database Systems
DevOps for Data Scientists
DevOps for Data Scientists: Containers for Data Science
DevOps for Data Scientists: Data DevOps Concepts
DevOps for Data Scientists: Data Science DevOps
DevOps for Data Scientists: Deploying Data DevOps

DevOps for Data Scientists: Containers for Data Science

Course Number: it_dsdods_04_enus
Lesson Objectives

  • discover the key concepts covered in this course
  • describe the use of containers for data science
  • describe approaches to infrastructure as code for data deployment
  • describe Ansible and Vagrant approaches to data science deployment
  • describe provisioning tools used in data science
  • use Docker to build a data model
  • use Docker to perform model testing for deployment
  • use Docker to manage R deployments
  • use Docker for a PostgreSQL deployment
  • create a Docker persistent volume
  • use Jupyter Docker Stacks to get up and running with Jupyter
  • use the Anaconda distribution to run a Jupyter Notebook
  • use Jupyter Notebooks with a Cookiecutter data science project
  • use Docker Compose with PostgreSQL and Jupyter Notebooks
  • use a container deployment for Jupyter Notebooks with R
  • use a container strategy for a Jupyter deployment

Overview/Description

In this 16-video course, explore the use of containers to deploy data science solutions, using Docker with R, Python, Jupyter, and Anaconda. Begin with an introduction to containers and their role in deployment and data science. Then examine approaches to infrastructure as code for data deployment and the concepts behind Ansible and Vagrant approaches to data science deployment. Explore the main features of provisioning tools used in data science. You will learn how to use Docker to build data models, perform model testing for deployment, manage R deployments, and run a PostgreSQL deployment, and discover how to create Docker persistent volumes. Next, look at using Jupyter Docker Stacks to get up and running with Jupyter and using the Anaconda distribution to run a Jupyter Notebook. This leads into using Jupyter Notebooks with a Cookiecutter data science project. Then learn about using Docker Compose with PostgreSQL and Jupyter Notebooks, and using a container deployment for Jupyter Notebooks with R. The concluding exercise involves deploying Jupyter.
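To give a flavor of the PostgreSQL deployment and persistent volume topics in this course, here is a minimal sketch using the Docker SDK for Python. The image tag, container name, volume name, and password are illustrative assumptions, not course materials, and the sketch presumes a running local Docker daemon and an installed docker package.

```python
import docker

# Connect to the local Docker daemon (assumes Docker is running).
client = docker.from_env()

# Create a named volume so the database files outlive the container.
volume = client.volumes.create(name="pgdata-demo")  # hypothetical volume name

# Run PostgreSQL with the volume mounted at its data directory.
container = client.containers.run(
    "postgres:15",                       # official PostgreSQL image
    name="ds-postgres-demo",             # hypothetical container name
    detach=True,
    environment={"POSTGRES_PASSWORD": "example"},  # demo-only credential
    ports={"5432/tcp": 5432},
    volumes={"pgdata-demo": {"bind": "/var/lib/postgresql/data", "mode": "rw"}},
)

print(container.status)  # "created" at first; container.reload() refreshes it
```

The same pattern extends to the Jupyter Docker Stacks topic by swapping in, for example, the jupyter/datascience-notebook image and its notebook port.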



Target Audience

Prerequisites: none

DevOps for Data Scientists: Data DevOps Concepts

Course Number: it_dsdods_01_enus
Lesson Objectives

  • discover the subject areas covered in this course
  • define the use and application of DevOps for data science and machine learning
  • describe topological considerations for data science and DevOps
  • apply high-level organizational and cultural strategies for data science with DevOps
  • describe the specific day-to-day tasks of DevOps for data science
  • assess technological risks and uncertainties when implementing DevOps for data science
  • describe scaling approaches to data science using DevOps
  • identify how DevOps can improve communication for data science workflows
  • identify how DevOps can help overcome ad hoc approaches to data science
  • describe considerations for ETL pipeline workflow improvements through DevOps
  • describe the microservice approach to machine learning
  • create a diagram of your data science infrastructure

Overview/Description

To carry out DevOps for data science, you need to extend the ideas of DevOps to be compatible with the processes of data science and machine learning (ML). In this 12-video course, learners explore the concepts behind integrating data and DevOps. Begin by looking at applications of DevOps for data science and ML, then examine topological considerations for data science and DevOps. This leads into applying high-level organizational and cultural strategies for data science with DevOps, and a look at the day-to-day tasks of DevOps for data science. Examine the technological risks and uncertainties of implementing DevOps for data science, along with scaling approaches to data science in terms of DevOps computing elements. Learn how DevOps can improve communication for data science workflows and help overcome ad hoc approaches to data science. Considerations for ETL (extract, transform, and load) pipeline workflow improvements through DevOps and the microservice approach to ML are also covered. The exercise involves creating a diagram of your data science infrastructure.
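To make the microservice approach to ML concrete, here is a minimal sketch of a prediction service built with Flask and scikit-learn. The model, route, and request schema are illustrative assumptions; a production service would load a versioned, serialized model artifact rather than training at startup.

```python
from flask import Flask, jsonify, request
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

app = Flask(__name__)

# Train a small stand-in model at startup; a real microservice would
# load a serialized model produced by the training pipeline instead.
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

@app.route("/predict", methods=["POST"])
def predict():
    # Expect JSON like {"features": [5.1, 3.5, 1.4, 0.2]} (hypothetical schema).
    features = request.get_json()["features"]
    prediction = model.predict([features])[0]
    return jsonify({"prediction": int(prediction)})

if __name__ == "__main__":
    app.run(port=5000)
```

Wrapping the model behind a small HTTP interface like this is what lets it be deployed, scaled, and rolled back independently of the rest of the data pipeline.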



Target Audience

Prerequisites: none

DevOps for Data Scientists: Data Science DevOps

Course Number: it_dsdods_02_enus
Lesson Objectives

  • discover the subject areas covered in this course
  • examine a Cookiecutter project structure
  • modify a Cookiecutter project to train and test a model
  • describe the steps in the data model life cycle
  • describe the benefits of version control for data science
  • describe tools and approaches to continuous integration for data models
  • describe approaches to data and model security for Data DevOps
  • describe approaches to automated model testing for Data DevOps
  • identify Data DevOps considerations for data science tools and IDEs
  • identify approaches to monitoring data models
  • describe approaches to logging for data models
  • identify ways to measure model performance in production
  • add directives to the makefile to prepare for continuous integration
  • implement a data integration task with Jenkins
  • implement data integration with Travis CI
  • incorporate a model into a Cookiecutter project

Overview/Description

In this 16-video course, learners discover the steps involved in applying DevOps to data science, including integration, packaging, deployment, monitoring, and logging. You will begin by learning how to install a Cookiecutter project for data science, then look at its structure, and discover how to modify a Cookiecutter project to train and test a model. Examine the steps in the data model life cycle and the benefits of version control for data science. Explore tools and approaches to continuous integration for data models, approaches to data and model security for Data DevOps, and approaches to automated model testing for Data DevOps. Learn about Data DevOps considerations for data science tools and IDEs (integrated development environments), along with approaches to monitoring and logging for data models. You will examine ways to measure model performance in production and look at data integration with Cookiecutter. Then learn how to implement a data integration task with both Jenkins and Travis CI (continuous integration). The concluding exercise involves incorporating a model into a Cookiecutter project.
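As a taste of the automated model testing covered here, the sketch below shows the kind of pytest check a Jenkins or Travis CI job might run on every commit. The dataset, accuracy floor, and file name are illustrative assumptions rather than the course's own materials.

```python
# test_model.py -- run with `pytest` from a CI step (hypothetical file name)
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def test_model_meets_accuracy_floor():
    # Hold out a test split so the check reflects generalization,
    # not memorization of the training data.
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=42
    )
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    # Fail the build if accuracy drops below an agreed floor
    # (0.9 is an arbitrary threshold for this sketch).
    assert model.score(X_test, y_test) >= 0.9
```

A makefile directive such as a test target that invokes pytest is then all the CI server needs to gate merges on model quality.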



Target Audience

Prerequisites: none

DevOps for Data Scientists: Deploying Data DevOps

Course Number: it_dsdods_03_enus
Lesson Objectives

  • discover the key concepts covered in this course
  • serialize models using Python and pickle
  • describe tools and approaches to model packaging and deployment
  • describe the blue-green deployment strategy for Data DevOps
  • describe the canary deployment strategy for Data DevOps
  • describe approaches to rolling back model versions
  • explore approaches to deploying models to web APIs
  • use Python and Pandas to serialize a model

Overview/Description

In this 8-video course, learners explore deploying data models into production through serialization, packaging, deployment, and rollback. You will begin by learning how to serialize models using Python and Pandas, then take a look at tools and approaches to model packaging and deployment. Next, explore the blue-green deployment strategy for Data DevOps, a strategy for upgrading running software by switching traffic between two parallel environments. This leads into the canary deployment strategy for Data DevOps; canary deployments can be regarded as a phased or test rollout of updates and new features to a small subset of users. Then take a look at versioning and approaches to rolling back models for machine learning with DevOps. Finally, you will learn about considerations for deploying models to web APIs (application programming interfaces). The concluding exercise involves creating a model by using Python and Pandas, then serializing the results of the model to a file.
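As a flavor of the serialization step that opens and closes this course, here is a minimal sketch using Python, Pandas, and pickle. The training data, model choice, and file name are illustrative assumptions, not the course's own exercise files.

```python
import pickle

import pandas as pd
from sklearn.linear_model import LinearRegression

# A tiny stand-in training set (hypothetical data).
df = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [2.1, 4.2, 5.9, 8.1, 9.8]})
model = LinearRegression().fit(df[["x"]], df["y"])

# Serialize the fitted model to disk...
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# ...and load it back, as a deployment target would at startup.
with open("model.pkl", "rb") as f:
    restored = pickle.load(f)

print(restored.predict(pd.DataFrame({"x": [6]})))
```

Keeping serialized model files like this under version control, or in an artifact store, is what makes the rollback strategies discussed above practical: an earlier model version can simply be reloaded.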



Target Audience

Prerequisites: none
